Semantic Suffix Net Clustering for Search Results
Authors
Abstract
Suffix Tree Clustering (STC) uses a suffix tree to find sets of snippets that share a common phrase and proposes clusters from them. STC is therefore a fast, incremental algorithm for automatic clustering and labeling, but it cannot cluster semantically similar snippets. The meaning of a word, however, is an important property that relates it to other words even when the text strings themselves do not match. In this paper, we propose a new semantic search results clustering algorithm called semantic suffix net clustering (SSNC), which is based on the semantic suffix net (SSN) structure. The proposed algorithm uses a net pruning technique to merge related suffixes through their suffix links in order to find base clusters. As a result, both string matching and word meaning serve as clustering conditions. Experimental results show that the proposed algorithm has lower time complexity than CFWMS, SSTC, and STC+GSSN, which are current semantic search results clustering methods. Moreover, the F-measure of the proposed algorithm is similar to that of the original STC, CFWMS, and STC+GSSN, and higher than that of MSRC and SSTC.
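The idea of base clusters built from shared phrases, optionally relaxed by word meaning, can be illustrated with a minimal Python sketch. This is not the SSNC algorithm or the SSN structure from the paper: it enumerates short word n-grams instead of building a suffix tree or net, and the `SYNONYMS` table, the `base_clusters` function, and its parameters are hypothetical stand-ins for a real semantic resource and a real implementation.

```python
from collections import defaultdict

# Hypothetical synonym table (an assumption for illustration only);
# a real system would consult a lexical resource instead.
SYNONYMS = {"car": "automobile", "auto": "automobile"}


def base_clusters(snippets, synonyms=None, max_len=3, min_docs=2):
    """Return {phrase: set of snippet indices} for phrases shared by
    at least `min_docs` snippets.

    Phrases are word n-grams of length 1..max_len. If `synonyms` is
    given, each word is first mapped to its canonical form, so that
    snippets can match by meaning as well as by exact string.
    """
    synonyms = synonyms or {}
    phrase_docs = defaultdict(set)
    for idx, text in enumerate(snippets):
        words = [synonyms.get(w, w) for w in text.lower().split()]
        for start in range(len(words)):
            for length in range(1, max_len + 1):
                if start + length > len(words):
                    break
                phrase = tuple(words[start:start + length])
                phrase_docs[phrase].add(idx)
    # Keep only phrases that occur in enough distinct snippets.
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}


snippets = ["cheap car rental", "auto rental deals", "suffix tree clustering"]
string_only = base_clusters(snippets)
with_meaning = base_clusters(snippets, synonyms=SYNONYMS)
```

With string matching alone, the first two snippets share only the phrase `("rental",)`. With the hypothetical synonym table, they also share `("automobile",)` and `("automobile", "rental")`, which shows in miniature how word meaning can add clustering evidence that exact string matching misses.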
Similar resources
Improving Web Search Results Using Semantic Clustering
This paper considers the problem of search engines that are not capable of retrieving appropriate results for a given query. Most users are not able to formulate a query that retrieves exactly what they want. So the search engine retrieves a massive list of data, ranked by the page rank algorithm, a relevancy algorithm, or a human judgment algorithm. If the relevant resul...
Semantic Suffix Tree Clustering
This paper proposes a new algorithm, called Semantic Suffix Tree Clustering (SSTC), to cluster web search results containing semantic similarities. The distinctive methodology of the SSTC algorithm is that it simultaneously constructs the semantic suffix tree through an on-depth and on-breadth pass by using semantic similarity and string matching. The semantic similarity is derived from the Wor...
Clustering of Web Search Results Using Semantic
Clustering is a data mining technique used in information retrieval. Clustering documents allows relevant information to be retrieved quickly. It organizes the documents into groups, each containing documents with similar content. Different algorithms are used to cluster documents, such as partitioned clustering (K-means Clustering) and Hierarchical Clustering (...
A semantics-based method for clustering of Chinese web search results
Information explosion is a critical challenge to the development of modern information systems. In particular, for information systems deployed over the Internet, the amount of information on the web has been increasing exponentially. Search engines, such as Google and Baidu, are essential tools for finding information on the Internet. Valuable informa...
Semantic, Hierarchical, Online Clustering of Web Search Results
Today, the search engine is the most commonly used tool for Web information retrieval; however, its current state is still far from satisfactory. This paper focuses on clustering Web search results in order to help users find relevant Web information more easily and quickly. The main contributions of this paper include the following. (1) The benefits of using key phrases as natural language inform...